
The Impact of Coreset Selection on Spurious Correlations and Group Robustness

Dharmasiri, Amaya, Yang, William, Kirichenko, Polina, Liu, Lydia, Russakovsky, Olga

arXiv.org Artificial Intelligence

Coreset selection methods have shown promise in reducing training data size while maintaining model performance for data-efficient machine learning. However, as many datasets suffer from biases that cause models to learn spurious correlations instead of causal features, it is important to understand whether and how dataset reduction methods may perpetuate, amplify, or mitigate these biases. In this work, we conduct the first comprehensive analysis of the implications of data selection on the spurious bias levels of the selected coresets and the robustness of downstream models trained on them. We use an extensive experimental setting spanning ten spurious correlations benchmarks, five score metrics that characterize sample importance/difficulty, and five data selection policies across a broad range of coreset sizes. In doing so, we uncover a series of nontrivial nuances in the interactions between sample difficulty and bias alignment, as well as between dataset bias and resultant model robustness. For example, we find that selecting coresets using embedding-based sample characterization scores runs a comparatively lower risk of inadvertently exacerbating bias than selecting using characterizations based on learning dynamics. Most importantly, our analysis reveals that although some coreset selection methods can achieve lower bias levels by prioritizing difficult samples, they do not reliably guarantee downstream robustness.
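The score-then-select pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's method: the score metric is assumed to be precomputed, and the three policy names (`hardest`, `easiest`, `random`) are illustrative stand-ins for the five policies studied.

```python
import numpy as np

def select_coreset(scores, fraction, policy="hardest"):
    """Select a coreset of the given fraction of the data using a
    per-sample importance/difficulty score (higher = harder).

    policy: 'hardest' keeps the highest-scoring samples,
            'easiest' the lowest-scoring, 'random' a uniform subset.
    Returns the sorted indices of the selected samples.
    """
    scores = np.asarray(scores)
    k = max(1, int(fraction * len(scores)))
    if policy == "hardest":
        idx = np.argsort(-scores)[:k]
    elif policy == "easiest":
        idx = np.argsort(scores)[:k]
    else:  # uniform random baseline
        idx = np.random.default_rng(0).choice(len(scores), k, replace=False)
    return np.sort(idx)
```

The abstract's finding can then be read as: the *source* of `scores` (embedding-based vs. learning-dynamics-based) changes how biased the selected subset is, even when the selection policy is held fixed.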




Balancing Client Participation in Federated Learning Using AoI

Javani, Alireza, Wang, Zhiying

arXiv.org Artificial Intelligence

Federated Learning (FL) offers a decentralized framework that preserves data privacy while enabling collaborative model training across distributed clients. However, FL faces significant challenges due to limited communication resources, statistical heterogeneity, and the need for balanced client participation. This paper proposes an Age of Information (AoI)-based client selection policy that addresses these challenges by minimizing load imbalance through controlled selection intervals. Our method employs a decentralized Markov scheduling policy, allowing clients to independently manage participation based on age-dependent selection probabilities, which balances client updates across training rounds with minimal central oversight. We provide a convergence proof for our method, demonstrating that it ensures stable and efficient model convergence. Specifically, we derive optimal parameters for the Markov selection model to achieve balanced and consistent client participation, highlighting the benefits of AoI in enhancing convergence stability. Through extensive simulations, we demonstrate that our AoI-based method, particularly the optimal Markov variant, improves convergence over the FedAvg selection approach across both IID and non-IID data settings by 7.5% and up to 20%. Our findings underscore the effectiveness of AoI-based scheduling for scalable, fair, and efficient FL systems across diverse learning environments. Federated learning (FL), introduced by McMahan et al. [1], emerged as a solution to the limitations of traditional machine learning models that require centralized data collection and processing. FL enables client devices to collaboratively train a global model while keeping all the training data localized, thus addressing privacy concerns.
Traditional machine-learning approaches require centralized data training in data centers, which often becomes impractical for edge devices due to privacy constraints in wireless networks and limited wireless communication resources. Federated learning overcomes these challenges by enabling devices to train machine learning models without data sharing and transmission, fulfilling the needs of data privacy and security. Compared to traditional distributed machine learning, Federated Learning introduces several challenges [2], including system heterogeneity from diverse device capabilities causing aggregation delays due to stragglers, statistical heterogeneity arising from non-IID and imbalanced client data affecting model convergence, and privacy concerns as exchanged model updates may inadvertently expose sensitive information. This paper is presented in part at the 2024 IEEE Global Communications Conference (Globecom). The authors are with the Center for Pervasive Communications and Computing, University of California, Irvine (e-mail: ajavani@uci.edu,
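The core idea of age-dependent, decentralized participation can be sketched as below. This is an illustrative simulation, not the paper's derived policy: the linear ramp `age / ramp` is an assumed stand-in for the optimal Markov transition probabilities the paper derives, and all parameter names are hypothetical.

```python
import random

def age_dependent_prob(age, ramp=5):
    """Selection probability that grows with a client's age, i.e. the
    number of rounds since it last participated. Reaches 1.0 at
    age >= ramp, guaranteeing a bounded gap between participations."""
    return min(1.0, age / ramp)

def simulate(num_clients=4, rounds=20, seed=0):
    """Each client independently flips a coin each round using its own
    age-dependent probability -- no central scheduler is needed."""
    rng = random.Random(seed)
    age = [1] * num_clients       # rounds since last participation
    counts = [0] * num_clients    # participations per client
    for _ in range(rounds):
        for c in range(num_clients):
            if rng.random() < age_dependent_prob(age[c]):
                counts[c] += 1    # client c joins this round
                age[c] = 1        # and its age resets
            else:
                age[c] += 1       # otherwise its age (AoI) grows
    return counts
```

Because the probability hits 1.0 once the age reaches `ramp`, every client is guaranteed to participate at least once every `ramp` rounds, which is the load-balancing property the AoI view provides.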


Optimizing Compound Retrieval Systems

Oosterhuis, Harrie, Jagerman, Rolf, Qin, Zhen, Wang, Xuanhui

arXiv.org Artificial Intelligence

Modern retrieval systems do not rely on a single ranking model to construct their rankings. Instead, they generally take a cascading approach where a sequence of ranking models are applied in multiple re-ranking stages. Thereby, they balance the quality of the top-K ranking with computational costs by limiting the number of documents each model re-ranks. However, the cascading approach is not the only way models can interact to form a retrieval system. We propose the concept of compound retrieval systems as a broader class of retrieval systems that apply multiple prediction models. This encapsulates cascading models but also allows other types of interactions than top-K re-ranking. In particular, we enable interactions with large language models (LLMs) which can provide relative relevance comparisons. We focus on the optimization of compound retrieval system design which uniquely involves learning where to apply the component models and how to aggregate their predictions into a final ranking. This work shows how our compound approach can combine the classic BM25 retrieval model with state-of-the-art (pairwise) LLM relevance predictions, while optimizing a given ranking metric and efficiency target. Our experimental results show optimized compound retrieval systems provide better trade-offs between effectiveness and efficiency than cascading approaches, even when applied in a self-supervised manner. With the introduction of compound retrieval systems, we hope to inspire the information retrieval field to more out-of-the-box thinking on how prediction models can interact to form rankings.
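A compound system that mixes first-stage scores with sparse pairwise comparisons can be sketched as follows. This is a toy aggregation, not the paper's learned design: win-count aggregation and BM25 tie-breaking are assumed for illustration, whereas the paper learns both where to apply the LLM comparator and how to aggregate its outputs.

```python
def compound_rank(bm25, pairs):
    """Combine first-stage BM25 scores with a sparse set of pairwise
    preferences (winner_index, loser_index), e.g. from an LLM comparator
    applied only to the document pairs a budget policy selected.

    Documents are sorted by net pairwise wins, with BM25 score as the
    tie-breaker for documents the comparator never touched.
    """
    wins = {d: 0 for d in range(len(bm25))}
    for winner, loser in pairs:
        wins[winner] += 1
        wins[loser] -= 1
    return sorted(range(len(bm25)),
                  key=lambda d: (wins[d], bm25[d]),
                  reverse=True)
```

Note how this differs from a cascade: the pairwise model is not restricted to re-ranking a top-K prefix; any pair the policy deems worth the cost can be compared, and uncompared documents still fall back to the cheap first-stage signal.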


Pre-Training Meta-Rule Selection Policy for Visual Generative Abductive Learning

Jin, Yu, Liu, Jingming, Luo, Zhexu, Peng, Yifei, Qin, Ziang, Dai, Wang-Zhou, Ding, Yao-Xiang, Zhou, Kun

arXiv.org Artificial Intelligence

Visual generative abductive learning studies jointly training a symbol-grounded neural visual generator and inducing logic rules from data, such that after learning, the visual generation process is guided by the induced logic rules. A major challenge for this task is reducing the time cost of logic abduction during learning, an essential step when the logic symbol set is large and the logic rules to induce are complicated. To address this challenge, we propose a pre-training method for obtaining a meta-rule selection policy for the recently proposed visual generative learning approach AbdGen [Peng et al., 2023], aiming to significantly reduce the candidate meta-rule set and prune the search space. The selection model is built on the embedding representations of both the symbol groundings of cases and the meta-rules, which can be effectively integrated with both the neural model and the logic reasoning system. The pre-training process is done on pure symbol data, without symbol-grounding learning from raw visual inputs, making the entire learning process low-cost. An additional interesting observation is that the selection policy can rectify symbol-grounding errors unseen during pre-training, which results from the memorization ability of the attention mechanism and the relative stability of symbolic patterns. Experimental results show that our method effectively addresses the meta-rule selection problem for visual abduction, boosting the efficiency of visual generative abductive learning.
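The pruning role of an embedding-based selection model can be sketched as below. This is a deliberately simplified stand-in: cosine similarity in a shared embedding space replaces the paper's learned attention-based policy, and all names are hypothetical.

```python
import numpy as np

def prune_meta_rules(case_emb, rule_embs, top_k=3):
    """Score each candidate meta-rule by cosine similarity between the
    case's symbol-grounding embedding and the meta-rule's embedding,
    keeping only the top_k candidates for the logic abduction step.

    case_emb:  (d,) embedding of a case's symbol grounding.
    rule_embs: (n, d) embeddings of the n candidate meta-rules.
    Returns indices of the retained meta-rules, best first.
    """
    case = case_emb / np.linalg.norm(case_emb)
    rules = rule_embs / np.linalg.norm(rule_embs, axis=1, keepdims=True)
    sims = rules @ case
    return np.argsort(-sims)[:top_k]
```

Even this crude filter illustrates the payoff the abstract claims: abduction only needs to search over `top_k` meta-rules instead of the full candidate set.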


ASAP: Learning Generalizable Online Bin Packing via Adaptive Selection After Pruning

Fang, Han, Weng, Paul, Ban, Yutong

arXiv.org Artificial Intelligence

Recently, deep reinforcement learning (DRL) has achieved promising results in solving online 3D Bin Packing Problems (3D-BPP). However, these DRL-based policies may perform poorly on new instances due to distribution shift. Besides generalization, we also consider adaptation, completely overlooked by previous work, which aims at rapidly finetuning these policies to a new test distribution. To tackle both generalization and adaptation issues, we propose Adaptive Selection After Pruning (ASAP), which decomposes a solver's decision-making into two policies: one for pruning and one for selection. The role of the pruning policy is to remove inherently bad actions, which allows the selection policy to choose among the remaining, most valuable actions. To learn these policies, we propose a training scheme based on a meta-learning phase for both policies, followed by a finetuning phase for the selection policy alone to rapidly adapt it to a test distribution. Our experiments demonstrate that ASAP exhibits excellent generalization and adaptation capabilities on in-distribution and out-of-distribution instances under both discrete and continuous setups.
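The two-policy decomposition can be sketched as follows. This is a minimal illustration under assumptions: both policies are represented by precomputed per-action scores, and fixed-fraction pruning is a stand-in for the learned pruning policy.

```python
import numpy as np

def asap_decide(prune_scores, select_scores, keep_frac=0.5):
    """Two-stage action choice in the spirit of prune-then-select.

    1. The pruning stage removes inherently bad actions, keeping only
       the top keep_frac fraction by prune score.
    2. The selection stage picks the best surviving action by its own
       score (in ASAP, only this stage is finetuned at test time).
    Returns the index of the chosen action.
    """
    prune_scores = np.asarray(prune_scores)
    select_scores = np.asarray(select_scores)
    k = max(1, int(keep_frac * len(prune_scores)))
    survivors = np.argsort(-prune_scores)[:k]
    best = survivors[np.argmax(select_scores[survivors])]
    return int(best)
```

The split matters for adaptation: because pruning and selection are separate, only the (smaller) selection stage needs finetuning on a new distribution while the pruning stage stays fixed.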